ABSTRACT

Astronomy has evolved almost exclusively by the use of spectroscopic and imaging 
techniques, operated separately. With the development of modern technologies it is 
possible to obtain datacubes in which one combines both techniques simultaneously, 
producing images with spectral resolution. To extract information from them can be 
quite complex, and hence the development of new methods of data analysis is desirable. 

We present a method for analysing datacubes (data from single-field observations,
containing two spatial and one spectral dimension) that uses PCA (Principal Component
Analysis) to express the data in a form of reduced dimensionality, facilitating
efficient information extraction from very large data sets. PCA transforms the system
of correlated coordinates into a system of uncorrelated coordinates ordered by
principal components of decreasing variance. The new coordinates are referred to as
eigenvectors, and the projections of the data onto these coordinates produce images
we will call tomograms. The association of the tomograms (images) with the eigenvectors
(spectra) is important for the interpretation of both. The eigenvectors are mutually
orthogonal, and this information is fundamental for their handling and interpretation.
When the datacube shows objects that present uncorrelated physical phenomena, the
orthogonality of the eigenvectors may be instrumental in separating and identifying them.
By handling eigenvectors and tomograms one can enhance features, extract noise,
compress data, extract spectra, etc.

We applied the method, for illustration purposes only, to the central region of the
LINER galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not
known before. Furthermore, we show that it is displaced from the centre of its stellar
bulge.

Key words: Methods: data analysis - methods: statistical - techniques: image pro- 
cessing - techniques: spectroscopic. 



1 INTRODUCTION 

Throughout the 20th century, astronomy developed through the use of imaging and
spectroscopic techniques, analysed independently. Extracting information from these
types of data requires relatively simple tools. With the advent of panoramic
spectroscopic devices such as Integral Field Units (IFUs) and Fabry-Perot
spectrographs, it is possible to construct datacubes of immense proportions that present
data in three dimensions: two spatial and one spectral. The analysis of these data may
become complex and overwhelming, as it may involve tens of millions of pixels. More
concerning is that, given this complexity, only some restricted subset of the data ends
up being analysed (kinematical maps, line flux ratios, extinction and excitation maps,
etc.); the rest is at risk of being largely ignored. New techniques that allow us to
extract information in a condensed, fast and optimized form are therefore necessary
and welcome.

In this paper we present a method of datacube interrogation that uses Principal
Component Analysis (PCA). This method condenses the significant information content
of the data, through effective dimensional reduction, facilitating its interpretation.
PCA compresses data expressed as a large set of correlated variables into a small but
optimal set of uncorrelated variables, ordered by their principal components. Clearly,
the goal of analysing data is to extract physical information from them; a dimensional
reduction does not necessarily produce valuable information, but an appropriate choice
of coordinates may help. PCA is a non-parametric analysis. This means that there are no
parameters or coefficients to adjust that somehow depend on the user's experience and
skills, or on physical and geometrical parameters of a proposed model. PCA provides a
unique and objective answer. In the traditional scientific method one formulates
questions and looks to the data for answers. In this new strategy, PCA produces the
answer; the user's challenge is to interpret the results. This process is not always
difficult, but it is often full of subtleties.

PCA has been used many times in the astronomical literature. For instance,
Deeming (1964) used this technique to analyse and classify stellar spectra; this
approach was improved by Whitney (1983). Applications to modern stellar spectroscopy
can be found in Bailer-Jones et al. (1998) and Re Fiorentin et al. (2007). The technique
was also used for morphological (Lahav et al. 1996) and spectral (Sodré & Cuevas 1997)
classification of galaxies and QSOs (Boroson 2003). Images of supernova remnants have
been analysed with the PCA technique (Warren et al. 2005). A more extended presentation
of this technique is given in Murtagh & Heck (1987) and Fukunaga (1990).

Most applications of PCA in astronomy are concerned with finding eigenvectors across
a population of objects. In the present case we want to apply the technique to a single
datacube in which the objects are spatial pixels of an individual field, containing a
single galaxy, nebula or set of stars. We identify eigenvectors (the uncorrelated
variables), which we refer to as eigenspectra, and tomograms, which are images of the
data projected onto the space of the eigenvectors. In traditional tomographic techniques
one obtains images that represent "slices" in three-dimensional space (the human body,
for example) or in velocity space (Doppler Tomography). In PCA Tomography one obtains
images that represent "slices" of the data in the eigenvector space (tomograms). The
good news is that each tomogram has an eigenspectrum associated with it. The
simultaneous analysis of the eigenspectra and associated tomograms brings a new
perspective to the interpretation of both.

With the aim of illustrating the PCA Tomography method, we have applied it to a
Gemini GMOS-IFU datacube of the nuclear region of the nearby LINER galaxy NGC 4736
(M94). The LINER characteristics of NGC 4736 are considered to be related to an
atypical population of stars, as it is an aging starburst galaxy (Eracleous et al. 2002;
Cid Fernandes et al. 2004). Applying the PCA methodology, we show that it has a bona
fide type 1 Active Galactic Nucleus (AGN) displaced from the centre of its stellar bulge.



2 FROM A DATACUBE TO A DATA MATRIX

Our aim is to analyse datacubes in which we have two spatial and one spectral
dimension. Each pixel of this original three-dimensional datacube has intensity
$(I_{ij\lambda})_0$; here $i$ and $j$ define a spatial pixel and $\lambda$ a spectral
pixel. We will assume that the datacube has $n = \mu \times \nu$ spatial pixels and $m$
spectral pixels. The mean intensity of all spatial pixels for a given $\lambda$ is

$$Q_\lambda = \frac{1}{n} \sum_{i=1}^{\mu} \sum_{j=1}^{\nu} (I_{ij\lambda})_0 \qquad (1)$$

$Q_\lambda$ being the average spectrum of the datacube. The intensity adjusted to the
mean is

$$I_{ij\lambda} = (I_{ij\lambda})_0 - Q_\lambda \qquad (2)$$

It is important to note at this point that all emission with null variance across the
spatial pixels (for a given wavelength or spectral energy) is incorporated into the mean
and subtracted out. This is the case, for instance, for sky emission that is constant
over the field of view (FoV).

Now we organize the new datacube (which has zero mean) into a matrix
$I_{\beta\lambda}$ of $n$ rows (spatial pixels, referred to here as objects) and $m$
columns (spectral pixels, referred to here as properties). Then $\beta$ can be
expressed as

$$\beta = \mu (i - 1) + j \qquad (3)$$

The datacube transformed into the matrix $I_{\beta\lambda}$ will be the subject of the
PCA Tomography method.



3 ELEMENTS OF PRINCIPAL COMPONENT
ANALYSIS - PCA

Principal Component Analysis (PCA) is a technique used to analyse multidimensional
datasets. It is a quite efficient method for extracting information from a large set of
data, as it allows us to identify patterns and correlations in the data that would
otherwise hardly be noticed. Mathematically it is defined as a linear orthogonal
transformation that expresses the data in a new (uncorrelated) coordinate system such
that the first of these new coordinates, $E_1$ (eigenvector 1), contains the largest
variance fraction, the second, $E_2$, contains the second largest variance fraction, and
so on. These new coordinates generated by the PCA are, by construction, orthogonal to
one another. For a more detailed description of the PCA method, see Murtagh & Heck
(1987), Fukunaga (1990), Johnson & Wichern (1998) and Hair et al. (1998).

In many PCA implementations normalization is done, so that the variance is uniform (and
generally unity) within the data. We will not adopt this strategy as we are interested
in retaining the relative spectral line intensities. Therefore we will analyse the
covariance matrix and not the correlation matrix.

The covariance matrix of $I_{\beta\lambda}$ can be expressed as

$$C_{cov} = \frac{[I_{\beta\lambda}]^T \cdot I_{\beta\lambda}}{n - 1} \qquad (4)$$


The matrix $C_{cov}$ is square and has $m$ rows and columns (equal to the number of the
original spectral pixels). The covariance matrix has some relevant properties. One is
that it is symmetric,

$$C_{cov} = [C_{cov}]^T \qquad (5)$$

The main diagonal elements correspond to the variances of each of the isolated
variables, while the other (cross) elements correspond to the covariance between two
distinct properties. The $m \times m$ covariance matrix has $m$ eigenvectors, $E_k$,
each one associated with one eigenvalue, $\Lambda_k$. The $E_k$ are the new uncorrelated
coordinates and $k$ is the order of the eigenvector, which can vary from 1 to $m$; the
eigenvectors are ordered by decreasing value of the associated $\Lambda_k$, which is the
variance of each component, to form the characteristic matrix, $E_{\lambda k}$, in which
columns correspond to eigenvectors. Note that, in order for all eigenvectors to be
defined, we require that $n \geq m$.
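
As an illustration only, the steps described so far (equations 1, 2 and 4, followed by the eigen-decomposition) can be sketched in Python with NumPy. The function names `cube_to_matrix` and `pca_eigen` and the synthetic cube are hypothetical and not part of the method description. The row ordering produced by `reshape` is a convention assumed here; any one-to-one mapping between $(i, j)$ and $\beta$ works, provided the same mapping is used to convert back to images.

```python
import numpy as np

def cube_to_matrix(cube):
    """Flatten a (mu, nu, m) cube into an (n, m) zero-mean matrix.

    Returns the matrix and the average spectrum Q_lambda (length m).
    """
    mu, nu, m = cube.shape
    matrix = cube.reshape(mu * nu, m).astype(float)   # rows: spatial pixels
    mean_spectrum = matrix.mean(axis=0)               # Q_lambda, equation (1)
    return matrix - mean_spectrum, mean_spectrum      # equation (2)

def pca_eigen(matrix):
    """Eigenvalues and eigenvectors of the covariance matrix, largest first."""
    n = matrix.shape[0]
    c_cov = matrix.T @ matrix / (n - 1)               # equation (4), m x m
    eigenvalues, eigenvectors = np.linalg.eigh(c_cov) # ascending order
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Synthetic cube, for illustration only: n = 30 spatial and m = 4 spectral pixels.
rng = np.random.default_rng(0)
cube = rng.normal(size=(6, 5, 4))
matrix, mean_spectrum = cube_to_matrix(cube)
eigenvalues, eigenvectors = pca_eigen(matrix)
```

`numpy.linalg.eigh` is used because the covariance matrix is symmetric (equation 5); it returns eigenvalues in ascending order, hence the explicit re-sorting.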

The transformation that corresponds to the PCA can be represented by the following
formula:

$$T_{\beta k} = I_{\beta\lambda} \cdot E_{\lambda k} \qquad (6)$$

where $T_{\beta k}$ is the matrix containing the data in the new coordinate system.

As the aim of PCA is to express the original data in the new system of uncorrelated
coordinates, one concludes that the ideal covariance matrix of the data in this new
coordinate system ($D_{cov}$) must be diagonal, that is, the covariance between the
coordinates must be zero. One may say that the PCA execution consists in determining the
matrix $E_{\lambda k}$ that satisfies equation (6) and is such that $D_{cov}$ is
diagonal:

$$D_{cov} = \frac{[T_{\beta k}]^T \cdot T_{\beta k}}{n - 1} \qquad (7)$$

The diagonal elements of $D_{cov}$ are the eigenvalues.
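
The statement that the covariance matrix becomes diagonal in the new coordinates can be checked numerically. The following is a minimal sketch on synthetic data (the random matrix and variable names are assumptions made for illustration): it projects the data as in equation (6) and verifies that the resulting covariance matrix, equation (7), is diagonal with the eigenvalues on its diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 5
# Synthetic correlated data: n objects (rows) by m properties (columns).
data = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))
data -= data.mean(axis=0)

c_cov = data.T @ data / (n - 1)
eigenvalues, eigenvectors = np.linalg.eigh(c_cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

tomogram_matrix = data @ eigenvectors                   # equation (6)
d_cov = tomogram_matrix.T @ tomogram_matrix / (n - 1)   # equation (7)

# Everything off the main diagonal should vanish to rounding error.
off_diagonal = d_cov - np.diag(np.diag(d_cov))
```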



4 EIGENSPECTRA AND TOMOGRAMS 

In the case of datacubes of astronomical interest, it is usual 
to have two-dimensional images with spectra associated with 
each spatial pixel. In calculating the PCA of such datacubes, 
one obtains eigenvectors as a function of wavelength, energy 
or frequency (properties) , that we will also refer to as eigen- 
spectra. 

On the other hand, $T_{\beta k}$ represents data in a new coordinate system. As our
objects are spatial pixels, their projection onto a given eigenvector may be represented
as a spatial image. Each column of $T_{\beta k}$ can now be transformed into a
two-dimensional image, $T_{ijk}$, using equation (3). We will refer to these images
$T_{ijk}$ as tomograms, since they represent "slices" of the data in the space of the
eigenvectors.

When a stellar-like feature is present within the FoV, 
contiguous pixels tend to be correlated as the signal is in- 
fluenced by the spatial Point Spread Function (PSF). Real 
structures have, thus, a minimum scale given by the PSF, 
usually determined by the seeing or intrinsic spatial instru- 
mental resolution. 

Analysing tomograms simultaneously with eigenspectra 
brings together a wealth of information. Spectral charac- 
teristics may be identified with features in the image and 
vice-versa. Interpreting such associations facilitates the un- 
derstanding of the three-dimensional structure within the 
datacube. In section 7 we will see an application of this, and
its potential will become clear.
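
To make the link between the columns of $T_{\beta k}$ and the tomogram images concrete, the sketch below reshapes each column back into a two-dimensional image. It assumes the same row ordering used when flattening the cube (NumPy's default row-major order here, which is an assumption of this example rather than a requirement of the method); the function name `tomograms_from_projection` is hypothetical.

```python
import numpy as np

def tomograms_from_projection(tomogram_matrix, mu, nu):
    """Reshape each column of the (n, m) projection into a (mu, nu) image."""
    n, m = tomogram_matrix.shape
    if n != mu * nu:
        raise ValueError("number of rows must equal mu * nu")
    # Result has shape (mu, nu, m): one image per eigenvector k.
    return tomogram_matrix.reshape(mu, nu, m)

# Synthetic cube, for illustration only.
rng = np.random.default_rng(2)
mu, nu, m = 4, 3, 5
cube = rng.normal(size=(mu, nu, m))
matrix = cube.reshape(mu * nu, m)
matrix = matrix - matrix.mean(axis=0)
_, eigenvectors = np.linalg.eigh(matrix.T @ matrix / (mu * nu - 1))
eigenvectors = eigenvectors[:, ::-1]            # largest variance first
tomograms = tomograms_from_projection(matrix @ eigenvectors, mu, nu)
```

Here `tomograms[:, :, k]` is the image to be inspected side by side with the eigenspectrum `eigenvectors[:, k]`.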



5 RECONSTRUCTION, COMPRESSION,
COSMETICS AND FLUX CALIBRATION OF
DATACUBES

It is, of course, possible to reconstruct the original datacube from all the
eigenvectors and tomograms. It is also possible, however, to partially reconstruct the
datacube using only those eigenvectors and tomograms that contain interesting or
relevant information, ignoring those that contain noise. It is not straightforward to
know where the signal stops and the noise becomes dominant. The Kaiser criterion
(Johnson & Wichern 1998) suggests that the limit is the mean eigenvalue. This criterion
seems to select too few eigenvectors. Alternatively, one can use the "scree test"
(Hair et al. 1998), which is illustrated in Fig. 1. In practice, the number of relevant
eigenvectors depends on the number of uncorrelated physical phenomena represented in the
object. There is no way to know this a priori; each case must be examined by the user,
and the cut-off adopted depends on the user's skills and predilections. Let us
reconstruct the datacube taking as a characteristic matrix the set of all eigenvectors
that have relevance up to $k = r$, ignoring all others. In this case the reconstructed
matrix $I'_{\beta\lambda}(\leq r)$ is

$$I'_{\beta\lambda}(\leq r) = T_{\beta k}(\leq r) \cdot [E_{\lambda k}(\leq r)]^T \qquad (8)$$

where $E_{\lambda k}(\leq r)$ is the characteristic matrix with columns corresponding to
eigenvectors up to $k = r$ and $T_{\beta k}(\leq r)$ is the data matrix in the new
coordinate system containing eigenvectors only up to $k = r$. From the matrix
$I'_{\beta\lambda}(\leq r)$ one can reconstruct the datacube $I'_{ij\lambda}(\leq r)$.
The datacube $I'_{ij\lambda}(\leq r)$ contains many more data (pixels) than do
$E_{\lambda k}(\leq r)$ and $T_{\beta k}(\leq r)$, even if it does not contain more
information. This form of data compression has practical applications, for example in
data transmission: it is much faster to send $E_{\lambda k}(\leq r)$ and
$T_{\beta k}(\leq r)$ than $I'_{ij\lambda}(\leq r)$, which can be reconstructed using
equation (8).
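
The partial reconstruction of equation (8) and the associated compression can be illustrated with a small numerical sketch. The synthetic data below (three mixed "signals" plus weak noise) and the names `reconstruct`, `stored_full` and `stored_compressed` are assumptions made for the example; real datacubes will behave differently in detail.

```python
import numpy as np

def reconstruct(tomogram_matrix, eigenvectors, r):
    """Equation (8): rebuild the zero-mean matrix from the first r components."""
    return tomogram_matrix[:, :r] @ eigenvectors[:, :r].T

rng = np.random.default_rng(3)
n, m, r = 300, 40, 3
# Synthetic data: three spectral "signals" mixed across pixels, plus weak noise.
signal = rng.normal(size=(n, r)) @ rng.normal(size=(r, m))
data = signal + 0.01 * rng.normal(size=(n, m))
mean_spectrum = data.mean(axis=0)
centred = data - mean_spectrum

eigenvalues, eigenvectors = np.linalg.eigh(centred.T @ centred / (n - 1))
eigenvalues, eigenvectors = eigenvalues[::-1], eigenvectors[:, ::-1]
tomogram_matrix = centred @ eigenvectors

full = reconstruct(tomogram_matrix, eigenvectors, m)        # all components
truncated = reconstruct(tomogram_matrix, eigenvectors, r)   # first r only

# Numbers to be stored or transmitted: the full matrix versus r tomograms
# plus r eigenspectra (the m values of the mean spectrum are not counted).
stored_full = n * m
stored_compressed = r * (n + m)
```

With all components the matrix is recovered exactly (to rounding error); with only `r` components the residual is at the level of the injected noise, while the amount of stored data drops from `n * m` to `r * (n + m)` values.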

Remembering that the average spectrum, $Q_\lambda$, was subtracted from the original
data (see equation 2), it can now be added to the reconstructed datacube, to recover
the calibration:

$$(I'_{ij\lambda}(\leq r))_0 = I'_{ij\lambda}(\leq r) + Q_\lambda \qquad (9)$$

In this case, the reconstructed datacube does not have the variance (presumably mostly
noise) contained in eigenvectors $r < k \leq m$.

Recall now that the eigenvalue $\Lambda_k$ can be expressed as

$$\Lambda_k = \frac{[T_{\beta k}(k)]^T \cdot T_{\beta k}(k)}{n - 1} \qquad (10)$$

where $T_{\beta k}(k)$ is the matrix containing only the column corresponding to the
projection of the data on $E_k$. The sum of the variance contained in eigenvectors
$r < k \leq m$, or "noise" (this sum could still contain some "signal"), may thus be
evaluated as $\sigma$, in the rms sense, between the images $(I'_{ij\lambda}(\leq r))_0$
and $(I_{ij\lambda})_0$, and may be expressed as

$$\sigma^2 = \sum_{k=r+1}^{m} \Lambda_k \qquad (11)$$
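
The relation between the discarded eigenvalues and the residual of a truncated reconstruction can be verified numerically. The sketch below uses synthetic data and assumes one particular normalization of the "rms sense": the squared differences are summed over all spectral pixels and divided by $n - 1$, matching the convention of equation (10). Other normalizations would differ by a constant factor.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, r = 150, 12, 4
data = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))
centred = data - data.mean(axis=0)

eigenvalues, eigenvectors = np.linalg.eigh(centred.T @ centred / (n - 1))
eigenvalues, eigenvectors = eigenvalues[::-1], eigenvectors[:, ::-1]
tomogram_matrix = centred @ eigenvectors

# Equation (10): each eigenvalue is the variance of its tomogram column.
column_variances = (tomogram_matrix ** 2).sum(axis=0) / (n - 1)

# Residual left after reconstructing with the first r components only.
truncated = tomogram_matrix[:, :r] @ eigenvectors[:, :r].T
residual_variance = ((centred - truncated) ** 2).sum() / (n - 1)
discarded_variance = eigenvalues[r:].sum()          # equation (11)
```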



One may also reconstruct the datacube of a single eigenvector $E_k$. In this case,

$$I'_{\beta\lambda}(k) = T_{\beta k}(k) \cdot [E_{\lambda k}(k)]^T \qquad (12)$$

where $E_{\lambda k}(k)$ is the matrix containing the column corresponding to the
eigenvector $E_k$. From the matrix $I'_{\beta\lambda}(k)$, one can reconstruct the
datacube $I'_{ij\lambda}(k)$, which has the original dimensions but contains information
from the object-eigenvector $E_k$ only.

It is quite common to have cosmetic problems in datacubes. This can happen, for
example, through incomplete removal of cosmic rays and hot/cold pixels. In this
situation the "defect" may appear as a specific eigenvector or correlated with some set
of other properties. It is usually easily detected and can be removed in the appropriate
eigenvector, by excluding or by correcting it.

Flux calibration of a reconstructed datacube can be done by adding the average
spectrum, $Q_\lambda$ (as shown in equation 9). In general this process can only be
done if the datacube is reconstructed with all its components. If we choose to ignore
the noise components, we incur a small, often negligible, error. If the average spectrum
has two components, say a stellar and a line-emitting component, then one could, in
principle, separate the two and, by separating them in the reconstructed datacube,
calibrate both. The final result is additive. In section 7 we will see an application of
such a procedure.

When the datacube is reconstructed, it may have a spatially defined field without any
object, representing background only; because the initial average spectrum was
subtracted, this field should present a negative signal. One way to fix this is by
adding the average spectrum, as seen above. Sometimes, however, this is not desirable,
for example when the average spectrum contains sky emission. In this case it might be
better to calculate the average spectrum of the background (this average spectrum is not
affected by sky emission, because it was obtained from eigenvectors), multiply it by
$-1$ and add it to the entire cube. This way we ensure that the spectrum of the
background is reset to zero. This procedure may be particularly useful for sky
subtraction in data obtained with Fabry-Perot instruments.
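
A minimal sketch of this background reset is given below. It assumes that the user supplies a boolean mask marking spatial pixels believed to contain no object; the function name `reset_background`, the mask and the synthetic cube are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def reset_background(cube, background_mask):
    """Shift the cube so that the mean background spectrum becomes zero.

    cube has shape (mu, nu, m); background_mask is a boolean (mu, nu) array
    marking spatial pixels assumed to contain no object.
    """
    background_spectrum = cube[background_mask].mean(axis=0)
    return cube - background_spectrum      # same as adding (-1) * spectrum

# Synthetic reconstructed cube whose upper rows show a negative background.
rng = np.random.default_rng(6)
cube = rng.normal(size=(8, 8, 10))
cube[:4] -= 3.0
mask = np.zeros((8, 8), dtype=bool)
mask[:4] = True
corrected = reset_background(cube, mask)
```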
Figure 1. The "scree test" applied to the first 15 eigenvalues of
the NGC 4736 datacube. One can see that the graph levels off for
eigenvectors above 7. Eigenvalue 1 is out of scale.



6 FEATURE SUPPRESSION AND
ENHANCEMENT

In order to suppress or emphasize the properties of a given feature "A" (defined by its
image or spectral characteristics), we construct a feature factor $\Gamma_k(A)$ for each
eigenvector $E_k$, such that

$$\Gamma_k(A) = 1;\,0 \qquad (13)$$

depending on whether eigenvector $k$ is to be suppressed (0) or not (1); this is a
user-chosen value. Feature "A" may be a star, a galactic nucleus, a spectral class, or a
feature such as the Broad Line Region (BLR) of an Active Galactic Nucleus. With such a
strategy we can reconstruct a datacube in which the desired feature is suppressed or
enhanced:

$$I'_{ij\lambda}(A) = \sum_k \left[ I'_{ij\lambda}(k) \cdot \Gamma_k(A) \right] \qquad (14)$$

or, obtaining it directly,

$$I'_{\beta\lambda}(A) = T_{\beta k} \cdot [(E_{\lambda k})_\Gamma]^T \qquad (15)$$

where $(E_{\lambda k})_\Gamma$ corresponds to the matrix $E_{\lambda k}$ with each of its
columns (which correspond to the eigenvectors $E_k$) multiplied by the corresponding
$\Gamma_k$ factor. From the matrix $I'_{\beta\lambda}(A)$ one can reconstruct the
corresponding datacube $I'_{ij\lambda}(A)$. Now one can project the intensities on the
dimensions $ij$, showing explicitly the image with the suppression or enhancement of
object "A". The spectrum of the enhanced object could also be extracted; such an example
is shown in Fig. 4.

We can follow this with an alternative or parallel strategy. Instead of adding the
intensities (as in equation 14), one can add the intensity associated with each
eigenvector after normalizing it by its variance. To do this, we first multiply each
column of the matrix $T_{\beta k}$ (each of which corresponds to a tomogram) by the
factor $N_k$, given by

$$N_k = \left[ \Lambda_k \cdot (n - 1) \right]^{-1/2} \qquad (16)$$

where $n$ is the number of spatial pixels in the image. $N_k$ corresponds to a
normalization factor. A consequence of the factor $(n - 1)$ is that the sum of the
squares of all spatial pixels is 1. Then we define

$$V'_{\beta\lambda}(A) = (T_{\beta k})_N \cdot [(E_{\lambda k})_\Gamma]^T \qquad (17)$$

where $(T_{\beta k})_N$ corresponds to the matrix with unit variance, that is, with each
of its columns (which correspond to the tomograms) multiplied by the corresponding $N_k$
factor. It is important to note that each tomogram has zero mean. From the matrix
$V'_{\beta\lambda}(A)$ one can reconstruct the corresponding datacube
$V'_{ij\lambda}(A)$. The difference is that in the case of $I'_{ij\lambda}$ one
emphasizes the intensity component of each eigenvector, while in the case of
$V'_{ij\lambda}$ one has the distinct characteristics of all eigenvectors with the same
weight. The advantage of $V'_{ij\lambda}$ with respect to $I'_{ij\lambda}$ is that the
former shows more "colorful" features, enhancing the many characteristics of all
eigenvectors, but it may also enhance the noise as it gives similar weight to all
isolated eigenvectors. Fig. 5 shows such an example.
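
The two reconstructions of this section can be sketched as follows. This is an illustration under stated assumptions: the function name `enhance` and the choice of feature factors are hypothetical, the data are synthetic, and the normalization factor is taken to be the inverse square root of the eigenvalue times $(n - 1)$, which is the choice that makes the sum of the squares of each normalized tomogram equal to 1.

```python
import numpy as np

def enhance(tomogram_matrix, eigenvectors, eigenvalues, gamma, normalise=False):
    """Rebuild the data matrix keeping only eigenvectors with gamma = 1.

    normalise=False gives the intensity matrix (columns of E weighted by gamma);
    normalise=True first scales each tomogram column to unit sum of squares.
    """
    n = tomogram_matrix.shape[0]
    gamma = np.asarray(gamma, dtype=float)
    weighted_vectors = eigenvectors * gamma           # columns times gamma_k
    tomos = tomogram_matrix
    if normalise:
        n_k = 1.0 / np.sqrt(eigenvalues * (n - 1))    # normalization factor
        tomos = tomogram_matrix * n_k
    return tomos @ weighted_vectors.T

rng = np.random.default_rng(5)
n, m = 120, 6
data = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))
centred = data - data.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(centred.T @ centred / (n - 1))
eigenvalues, eigenvectors = eigenvalues[::-1], eigenvectors[:, ::-1]
tomogram_matrix = centred @ eigenvectors

gamma = [0, 1, 1, 0, 0, 0]                            # hypothetical user choice
intensity_matrix = enhance(tomogram_matrix, eigenvectors, eigenvalues, gamma)
unit_matrix = enhance(tomogram_matrix, eigenvectors, eigenvalues, gamma,
                      normalise=True)
normalised_tomograms = tomogram_matrix / np.sqrt(eigenvalues * (n - 1))
```

Setting every feature factor to 1 without normalization returns the original zero-mean matrix, which is a convenient sanity check before experimenting with other choices.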
Figure 2. The projection of the average intensities $\langle (I_{ij})_0 \rangle$
for each $\lambda$ of the original datacube onto the spatial pixels $ij$. The
bulge has a comet-like morphology.

7 APPLICATION: THE CENTRAL REGION 
OF THE LINER GALAXY NGC 4736 

Let us illustrate the application of the method of PCA Tomography to a particular
case. We will attempt to answer the following question: is there a supermassive black
hole in the nearby LINER galaxy NGC 4736? LINERs are a class of objects with diverse
nature (Heckman 1980). Although most of them seem to host an AGN, in the sense that they
are powered by accretion onto a supermassive black hole, some objects have not shown any
evidence of this. NGC 4736 is somewhat peculiar because it presents a stellar population
that corresponds to an aging starburst. Could this explain its LINER nature? See
Eracleous et al. (2002) and Cid Fernandes et al. (2004) for a more detailed discussion.

7.1 The data 

In an attempt to solve the puzzle associated with this galaxy, we observed it with the
Gemini Multi-Object Spectrograph (GMOS; Hook et al. 2004; Allington-Smith et al. 2003),
operated in the Integral Field Unit (IFU) mode. The data were obtained on 2006 June 23
with the Gemini North Telescope. The datacube was obtained using 500 fibers on the
object and 250 fibers on sky, 1 arcminute away. The spectral resolution was R = 2900,
covering from 4700 to 6800 Å. Three 20-minute integrations were obtained.

The sky fibers actually observed the inner ring of the galaxy, as there was no other
way to position them. For this reason these sky observations were not used and the
datacube we analyzed did not have any kind of sky subtraction. In this situation PCA
analysis is still possible, as the sky has no spatial variance and is incorporated in
the average spectrum, being removed from the cube right at the beginning. Two strong
telluric emission lines can be seen in the average spectrum (Fig. 3) and leave no trace
in any of the eigenvectors or in the reconstructed $I'_{ij\lambda}$ or $V'_{ij\lambda}$
cubes. Special care must be taken only when dealing with flux calibration, and in
sections 7.4 and 7.5 we show that this is still possible.

Comparison CuAr lamps, flatfields, twilight flats and bias images were taken to reduce
and calibrate the data. The data reduction was done with IRAF, using the gemini.gmos
task package, which handles the bias and background subtraction, cosmic ray rejection,
CCD and fiber sensitivity correction, wavelength and flux calibration and construction
of the datacubes. Our final scientific datacube was extracted with a spatial sampling of
0.05 arcsec pixel$^{-1}$ (4 x 4 data pixels per fiber), which oversamples the real
spatial resolution as determined by the 0.55 arcsec seeing experienced at the time of
the observations. The datacube has 6,200 spectral pixels, with 0.34 Å pixel$^{-1}$
spectral sampling.

As the GMOS atmospheric dispersion corrector was not operational, the differential
atmospheric refraction was appreciable, giving wavelength-dependent distortions
throughout the datacube. To evaluate this, we used the formula from Filippenko (1982)
and applied our own algorithm for differential atmospheric refraction correction
(Steiner et al., in preparation). This algorithm corrects each pixel for the atmospheric
differential refraction to an accuracy of about 1/20 of the seeing disk.

A Richardson-Lucy deconvolution algorithm (Richardson 1972; Lucy 1974) was applied to
all images in the datacube using 6 iterations. This procedure has two effects: it
sharpens the PSF while suppressing the high-frequency noise. If the number of iterations
is too small, these improvements are negligible while, if the number is too large,
low-frequency noise is introduced. We found that ~6 iterations was a good compromise;
the deconvolution reduced the FWHM of the PSF by a factor of 1.4. The instrumental PSF
adopted for the purposes of the deconvolution was gauged from the spatially compressed
image as a Gaussian having a FWHM of 0.47 arcsec. This type of deconvolution is
compatible with PCA; we have tried these procedures on dozens of cases, involving
datacubes of galaxies, nebulae and stars, with good results.
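
For readers unfamiliar with the Richardson-Lucy iteration, a bare-bones sketch for a single image is given below. It is not the implementation used for the data in this paper: it assumes a Gaussian PSF, circular (FFT-based) convolution, a flat starting estimate and a strictly positive image, and all function names are hypothetical. In a datacube the same operation would be repeated for the image at each wavelength.

```python
import numpy as np

def gaussian_psf(size, fwhm_pixels):
    """Normalised 2-D Gaussian PSF on a size x size grid (size must be odd)."""
    sigma = fwhm_pixels / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    axis = np.arange(size) - size // 2
    xx, yy = np.meshgrid(axis, axis)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

def psf_transform(psf, shape):
    """FFT of the PSF, zero-padded to `shape` and centred on pixel (0, 0)."""
    padded = np.zeros(shape)
    size = psf.shape[0]
    padded[:size, :size] = psf
    padded = np.roll(padded, (-(size // 2), -(size // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def convolve(image, kernel_ft):
    """Circular convolution of an image with a kernel given by its FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * kernel_ft))

def richardson_lucy(image, psf, iterations=6):
    """Plain Richardson-Lucy iteration for a strictly positive image."""
    psf_ft = psf_transform(psf, image.shape)
    estimate = np.full(image.shape, image.mean())
    for _ in range(iterations):
        blurred = convolve(estimate, psf_ft)
        ratio = image / np.maximum(blurred, 1e-12)
        # Correlation with the PSF equals convolution with its mirror image,
        # whose transform is the complex conjugate of the PSF transform.
        estimate = estimate * convolve(ratio, np.conj(psf_ft))
    return estimate

# Synthetic test image: a point source on a faint flat background, blurred
# by a Gaussian of FWHM 0.47 arcsec sampled at 0.05 arcsec per pixel.
psf = gaussian_psf(41, fwhm_pixels=0.47 / 0.05)
truth = np.full((64, 64), 0.1)
truth[32, 32] += 1000.0
image = convolve(truth, psf_transform(psf, truth.shape))
restored = richardson_lucy(image, psf, iterations=6)
```

With a normalised PSF this iteration conserves the total flux of the image while concentrating it towards the point source, which is the sharpening effect described above.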

In this paper we will analyse only the data corresponding to the GMOS red CCD, with a
wavelength range from 6179 to 6848 Å and 1976 spectral and 5170 spatial pixels, after
trimming some of the borders because of the atmospheric differential refraction
correction. The full data set is analysed in Steiner et al. (in preparation).

7.2 Eigenvectors, Tomograms and Eigenvalues 

We are ready to perform the PCA Tomographic analysis of the datacube. Before doing so,
we have subtracted the average of all spatial pixels, $Q_\lambda$, for each wavelength
pixel. The original datacube has its spatial projection shown in Fig. 2, while the
average spectrum (see equation 1) is shown in the top diagram of Fig. 3.

Figure 3. The average spectrum (top diagram), as defined in equation (1); the stellar
(middle diagram) and gaseous (lower diagram) components of the average spectrum were
obtained by subtracting a scaled emission line (could also have been a stellar)
template. See sections 7.4 and 7.5 for how this is done.

How many eigenvectors do we want to work with? Applying the scree test (see Fig. 1),
one can see that the relevant eigenvectors are limited to the first seven. It is always
good to examine the eigenvectors and tomograms case by case; in the present example the
features associated with gas emission seem to be present in the first seven,
disappearing into the noise for eigenvectors of higher order. This confirms the
conclusion from the scree test. However, this does not mean that information about the
stellar population is not encoded in eigenvectors of higher order. In dealing with the
stellar component, one should keep this in mind. In what follows we are interested in
the features associated with the emission lines and will limit ourselves to the first
eight eigenvectors.

The eight principal components are shown as eigenvectors and tomograms in Appendix A,
and their eigenvalues in Table 1. As can be seen, eigenvector 1 contributes 99.74 per
cent of the variance. This means that this eigenspectrum basically replicates what one
would see in a spectrum obtained with traditional spectroscopic techniques. A close
comparison with the average spectrum (Fig. 3) confirms this. Tomogram 1 is an image
comparable with that of a classic central stellar bulge. Although the eigenspectrum
looks like a standard spectrum, it is not; the scale is not associated with intensity.

Eigenvector 2 contributes 0.088 per cent of the variance and displays, in combination
with its tomogram, a clear map of the rotation of the emission line gas in the FoV. It
is also clear from its tomogram that this eigenvector is uncorrelated with the stellar
component.

Eigenvector 3 contributes 0.032 per cent of the variance. Its characteristic is that it
displays correlations among features that can be associated with emission line
transitions. It is quite surprising that features related to two kinds of emission lines
are visible: narrow lines, associated with the [O I], [N II] and [S II] species and,
also, Hα. But there is also a feature associated with a broad Hα component. This
component is typical of Seyfert 1 (or LINER type 1) galaxies and is usually taken as
clear evidence for an AGN associated with a supermassive black hole. This is, therefore,
an important discovery, which has never been reported before, despite the fact that this
is a nearby galaxy. The broad lines associated with such features are emitted in the
Broad Line Region (BLR), while the other, narrower, lines are emitted in the Narrow Line
Region (NLR). Features associated with [O I] lines are also present in eigenvector 3,
as they are in $E_2$; however, they were not visible in $E_1$.

Eigenvector 4 and its respective tomogram (contributing 0.013 per cent of the variance)
again show a correlation among the narrow line features, but this time it is
anti-correlated with the broad Hα. Notice that in both $E_3$ and $E_4$ the emission
line features are correlated with the continuum in a complementary way. Eigenvectors 5
and 6 show correlations between narrow line features but involving distinct line widths.

One could attempt to interpret all eigenvectors up to the limit of 1976, the number of
wavelength pixels (properties) in the datacube analysed here. However, it is clear that
the eigenvalues become smaller as the relative noise level of the eigenvector increases.
In the present case the eight principal components explain 99.8979 per cent of the
variance (see Table 1), the remaining 0.1021 per cent of the variance being contained in
the other 1968 eigenvectors. Eigenvectors 7 and 8 are the last to be shown here
(Appendix A). $E_7$ is still dominated by broad Hα, but the noise level clearly becomes
strong and competes with any signal after $E_8$ (see Fig. 1).

Table 1. Eigenvectors and corresponding eigenvalues for the first
eight principal components.

Eigenvector   Eigenvalue            Accumulated fraction
E_k           (% of the variance)   (% of the variance)
E_1           99.7443               99.7443
E_2            0.0883               99.8326
E_3            0.0325               99.8651
E_4            0.0129               99.8781
E_5            0.0084               99.8864
E_6            0.0048               99.8912
E_7            0.0039               99.8952
E_8            0.0027               99.8979

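
The values of Table 1 allow a quick check of the two selection criteria mentioned in section 5. The short sketch below is only a worked illustration using the tabulated percentages; it applies the mean-eigenvalue criterion, using the fact that the eigenvalues are expressed as percentages of the total variance spread over 1976 eigenvectors, and computes the ratios between consecutive eigenvalues as a crude numerical counterpart of the scree plot.

```python
# Variance fractions (per cent) of the first eight components, from Table 1.
variance_percent = [99.7443, 0.0883, 0.0325, 0.0129, 0.0084, 0.0048, 0.0039, 0.0027]
n_eigenvectors = 1976  # number of spectral pixels, hence of eigenvectors

# Mean-eigenvalue criterion: keep components whose eigenvalue exceeds the mean.
mean_percent = 100.0 / n_eigenvectors
kaiser_count = sum(1 for value in variance_percent if value > mean_percent)

# Scree-style view: ratio of each eigenvalue to the previous one.
ratios = [later / earlier
          for earlier, later in zip(variance_percent, variance_percent[1:])]

# Accumulated fraction of the variance in the first eight components.
accumulated = sum(variance_percent)
```

The mean eigenvalue is about 0.05 per cent, so this criterion would retain only the first two components, which illustrates why it selects too few eigenvectors for the present purposes.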

7.3 The Broad Line Region: location and 
spectrum 

As mentioned earlier, obtaining the eigenvectors and tomograms is an objective process
that does not depend on choices made by the user. However, by handling eigenvectors and
tomograms, one can express aspects that do depend on the user's desires and skills. We
will explore such aspects in the following.

It is clear that NGC 4736 has an AGN and that this AGN has a BLR. The question now is
how to enhance this feature. This is a relevant question not only for the study of the
properties of this emitting region but also for determining the location of the AGN and,
thus, the location of a supermassive black hole. As already mentioned in section 6,
enhancing a feature "A" can be done by attributing the feature factor $\Gamma_k(A)$ to
each eigenvector $k$, thus reconstructing the datacube. This can be done in two ways:
equations (14) and (15) provide the intensity cube, $I'_{ij\lambda}(\mathrm{BLR})$;
alternatively, by attributing the factor $N_k$ (equation 16) to each tomogram, one can
construct the datacube normalized to unit variance, $V'_{ij\lambda}(\mathrm{BLR})$ (see
equation 17). This was done with the feature factors from Table 2.

From these reconstructed datacubes, we obtained the spectra and images of the BLR. The
spectra of the BLR were extracted from a circular region centred on the AGN and with a
radius of 0.2 arcsec. The BLR images were obtained with "narrow filters", built by
adding consecutive images in wavelength pixels, centred on the red wing of the broad Hα
feature (Figs. 4 and 5). These images map the location of the BLR in the FoV and, thus,
the position of the supermassive black hole.
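
The two extraction steps described above can be sketched as follows. The function names, the synthetic cube and the wavelength limits of the band are assumptions chosen for illustration; only the aperture radius (0.2 arcsec) and the pixel scale (0.05 arcsec per pixel) are taken from the text.

```python
import numpy as np

def narrow_band_image(cube, wavelengths, lambda_min, lambda_max):
    """Sum the cube over spectral pixels with lambda_min <= wavelength <= lambda_max."""
    band = (wavelengths >= lambda_min) & (wavelengths <= lambda_max)
    return cube[:, :, band].sum(axis=2)

def aperture_spectrum(cube, centre, radius_pixels):
    """Sum the spectra of all spatial pixels within a circle around centre."""
    rows, cols = np.indices(cube.shape[:2])
    inside = (rows - centre[0]) ** 2 + (cols - centre[1]) ** 2 <= radius_pixels ** 2
    return cube[inside].sum(axis=0)

# Synthetic cube: flat background plus one bright pixel with an emission line
# at an arbitrary wavelength (6600 Angstrom), for illustration only.
wavelengths = np.linspace(6179.0, 6848.0, 200)
cube = np.ones((21, 21, wavelengths.size))
line = np.exp(-0.5 * ((wavelengths - 6600.0) / 20.0) ** 2)
cube[10, 12] += 50.0 * line

image = narrow_band_image(cube, wavelengths, 6590.0, 6640.0)
peak = np.unravel_index(np.argmax(image), image.shape)
radius = 0.2 / 0.05            # 0.2 arcsec aperture at 0.05 arcsec per pixel
spectrum = aperture_spectrum(cube, peak, radius)
```

The brightest pixel of the narrow-band image gives the aperture centre, and the spectrum summed inside the aperture is what would be inspected for the broad component.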

By observing Figs. 4 and 5 one can note that, although presenting a lower
signal-to-noise ratio, the spectrum extracted from the cube
$V'_{ij\lambda}(\mathrm{BLR})$ does a better job of separating the BLR from the NLR.
This is as expected (see section 6), since all principal components enter with the same
weight. As the tomogram of principal component 1 (see Appendix A) represents the image
of the stellar bulge, one can now superpose the image of the BLR (Fig. 4) onto the
stellar component. This is shown in Fig. 6. An interesting and surprising discovery is
that the BLR, which locates



Figure 6. The location of the galactic stellar bulge (tomogram 1, 
in green) and the BLR (image from Fig. 4, in yellow) of the 
galaxy NGC 4736. The AGN is displaced by 0.15 arcsec from the 
centre of the galactic bulge. This corresponds to 3.5 parsec (~ 10 
light years). Notice the bulge's comet-like shape. These features 
are probably consequences of a galaxy merger that occurred a few 
billion years ago (Steiner et al., in preparation). 




Figure 8. The AGN flux calibrated spectrum of the LINER 
galaxy NGC 4736 (horizontal axis: wavelength in Å, from 6200 to 6800). 

Figure 4. The image and spectrum of the BLR, extracted from the reconstructed datacube $I'_{ij\lambda}(\mathrm{BLR})$. The image is obtained from 
a "narrow band" (by adding consecutive images in wavelength pixels), centred on the red wing of (the broad) Hα; it thus maps the 
location of the BLR. The spectrum was extracted from a circle of 0.2 arcsec centred at the bright spot in the image. Notice that the 
broad Hα emission is redshifted with respect to the rest frame (defined by the narrow lines). 




Figure 5. The image and the (smoothed) spectrum of the BLR, extracted from the reconstructed datacube $I''_{ij\lambda}(\mathrm{BLR})$ in a similar 
way as in Fig. 4. In addition to the broad Hα, it is possible to see an asymmetric tail to the red. The vertical dashed line represents the 
wavelength of Hα at the rest frame of NGC 4736. 



Table 2. The feature factor for the BLR. 

Eigenvector    $F_k(\mathrm{BLR})$ 
$E_1$          0 
$E_2$          1 
$E_3$          1 
$E_4$          1 
$E_5$          0 
$E_6$          0 
$E_7$          1 
$E_8$          0 

the hypothetical supermassive black hole, is not positioned 
at the centre of the galactic bulge. This lack of positional 
coincidence is unexpected (to say the least) and certainly 
has important consequences for the study of this galaxy 
(Steiner et al., in preparation). 



7.4 The stellar and gas emitting components of 
eigenvector 1 

Eigenvector 1 is dominated by the correlation among the 
spectral properties of the bulge stars and gas emission. 
Eigenvectors of higher order are basically dominated by 
correlations of gas-emitting properties only. Could we create 
two datacubes from eigenvector 1, the first representing the 
stellar continuum and the second the gas line emission? To 
attempt this we proceeded in the following way: we took a 
representative NLR spectrum from the reconstructed datacube 
$I'_{ij\lambda}$ (this representative narrow emission line 
spectrum is shown in Fig. 7) as a template and scaled this 
template so as to match the [N II] 6583 Å line intensity in 
$E_1$. Subtracting this scaled template from $E_1$ leaves us 
with the stellar component of $E_1$. These two (stellar and 
gaseous) components of $E_1$ are shown in Fig. 7. From these 
two vectors ($E_{1a}$ and $E_{1b}$) we reconstructed the 
respective datacubes using equation 12. By now adding the 
$E_{1b}$ datacube to all others, from 2 to 10, we obtain the 
final cube containing the emission lines, 
$I'_{ij\lambda}(\mathrm{gas})$. We can now study the 
morphological and spectral properties of the line emitting gas. 

Figure 7. The stellar ($E_{1a}$) and gaseous ($E_{1b}$) components of eigenvector $E_1$; the sum of these two components is equal to $E_1$ and each 
of them can be obtained by adopting a template from the other. 
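The template scaling used to split eigenvector 1 can be illustrated with a short sketch. It is not the authors' implementation: the way the line intensity is measured (peak above the median of a nearby continuum window), the function names, the window widths and the synthetic spectra are all assumptions chosen to keep the example small.

```python
import numpy as np

def line_height(spectrum, wavelengths, centre, half_width, cont_offset):
    """Peak of an emission line above the median of a nearby continuum window."""
    in_line = np.abs(wavelengths - centre) <= half_width
    in_cont = np.abs(wavelengths - (centre + cont_offset)) <= half_width
    return spectrum[in_line].max() - np.median(spectrum[in_cont])

def split_with_template(vector, template, wavelengths,
                        centre=6583.0, half_width=5.0, cont_offset=40.0):
    """Scale the template to the line at `centre` and subtract it."""
    scale = (line_height(vector, wavelengths, centre, half_width, cont_offset)
             / line_height(template, wavelengths, centre, half_width, cont_offset))
    gaseous = scale * template          # plays the role of E_1b
    stellar = vector - gaseous          # plays the role of E_1a
    return stellar, gaseous

wavelengths = np.arange(6200.0, 6800.0, 1.0)

def gaussian(centre, sigma=2.0):
    return np.exp(-0.5 * ((wavelengths - centre) / sigma) ** 2)

# Synthetic narrow-line template: [N II] 6548, H-alpha 6563, [N II] 6583.
template = 0.3 * gaussian(6548.0) + 1.0 * gaussian(6563.0) + 0.9 * gaussian(6583.0)
# Synthetic "eigenvector 1": flat stellar continuum plus scaled line emission.
e1 = np.full_like(wavelengths, 0.5) + 0.2 * template

e1a, e1b = split_with_template(e1, template, wavelengths)
```

By construction the two parts add up to the input vector, so datacubes reconstructed from them add up to the datacube of eigenvector 1.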



7.5 Flux calibration 

Flux calibration of a reconstructed datacube can be recovered 
by adding the average spectrum (as shown in equation 9). 
In general this can only be done if the cube 
is reconstructed with all the components. Here, however, we 
neglected all eigenvectors above 10, as they essentially 
represent noise and add up to a tiny fraction of the variance. 
Moreover, we are interested here in calibrating the datacubes of gas 
line emission and of the stellar component separately. To do this 
we proceeded in the following way: using the spectrum of the 
gaseous component of $E_1$ (shown in Fig. 7) as a template, 
we decomposed the average spectrum into its stellar and 
gaseous components in a way similar to the method used 
for eigenvector $E_1$ in the previous section. These two 
components are shown in Fig. 3 (the two narrow telluric lines 
seen in the upper diagram were removed "by hand" in the 
lower diagram). Then the stellar component of the average 
spectrum was added to the datacube obtained from the $E_{1a}$ 
component of eigenvector 1. Similarly, the gaseous component 
of the average spectrum was added to the datacube 
reconstructed from vector $E_{1b}$ and all others, from 2 to 10, 
giving a flux calibrated gaseous cube, $(I'_{ij\lambda}(\mathrm{gas}))_0$. 
This decomposition of the average spectrum and the addition 
of its parts to the respective datacubes can be done because the 
stellar and gaseous components add up linearly; so we end up 
having two flux calibrated datacubes in such a way that, if 
added together, they form the original calibrated datacube, 
except for the discarded noise. 
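The linearity argument can be checked on synthetic data. In the sketch below the split of eigenvector 1 and of the average spectrum into "stellar" and "gaseous" parts is done with a crude wavelength window, which is only a stand-in for the template-based decomposition described in the text; the array sizes, variable names and random data are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pix, n_lam, n_keep = 25, 12, 10           # spatial pixels, spectral pixels, kept components

data = rng.normal(size=(n_pix, n_lam)) + 5.0
mean_spectrum = data.mean(axis=0)
centred = data - mean_spectrum

_, _, vt = np.linalg.svd(centred, full_matrices=False)
eigvecs = vt.T                               # (n_lam, n_lam), columns by decreasing variance
tomograms = centred @ eigvecs                # (n_pix, n_lam)

# Stand-in decomposition: everything inside the window counts as "gas".
line_window = np.zeros(n_lam, dtype=bool)
line_window[5:8] = True
e1 = eigvecs[:, 0]
e1_gas = np.where(line_window, e1, 0.0)
e1_stellar = e1 - e1_gas
mean_gas = np.where(line_window, mean_spectrum, 0.0)
mean_stellar = mean_spectrum - mean_gas

# Stellar cube: stellar part of component 1 plus stellar part of the mean.
stellar_cube = np.outer(tomograms[:, 0], e1_stellar) + mean_stellar
# Gas cube: gaseous part of component 1, components 2..n_keep, gaseous mean.
gas_cube = (np.outer(tomograms[:, 0], e1_gas)
            + tomograms[:, 1:n_keep] @ eigvecs[:, 1:n_keep].T
            + mean_gas)

# Calibrated cube rebuilt from the first n_keep components, and what was dropped.
truncated = tomograms[:, :n_keep] @ eigvecs[:, :n_keep].T + mean_spectrum
discarded = tomograms[:, n_keep:] @ eigvecs[:, n_keep:].T
```

The sum of the two calibrated cubes equals the truncated reconstruction, and adding the discarded components returns the original data, whatever the particular split, as long as the parts add up linearly.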



7.6 Extracting the AGN spectrum 

Finally we can extract the flux calibrated AGN spectrum 
from the flux calibrated cube $(I'_{ij\lambda}(\mathrm{gas}))_0$. Notice that, both 
for constructing this cube and for calibrating it, PCA was 
crucial. As we know the location of the AGN (from Figs. 4 
and 5), the extraction can be made taking a circular aperture 
of radius 0.5 arcsec (Fig. 8). The flux of the broad 
component of Hα is ~ $2.14 \times 10^{-13}$ erg s$^{-1}$ cm$^{-2}$. This 
corresponds to a luminosity of the broad Hα component of 
L ~ $6.14 \times 10^{38}$ erg s$^{-1}$. This luminosity is similar to that 
of NGC 4395, currently known as the least luminous Seyfert 
galaxy (Filippenko & Sargent 1989). 

With a distance of 4.9 Mpc, this is one of the nearest 
type 1 AGN. Other objects with similar distances are M81, 
with a distance of 3.5 Mpc, NGC 4395 (4.1 Mpc) and Cen 
A (4.3 Mpc). 
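The quoted luminosity follows from the measured flux and the adopted distance. As a quick consistency check, assuming isotropic emission ($L = 4\pi d^2 F$) and a parsec of $3.0857 \times 10^{18}$ cm, the arithmetic can be reproduced as below; the same distance also gives the physical scale of the 0.15 arcsec displacement quoted in the caption of Fig. 6. The function names are illustrative.

```python
import math

PC_CM = 3.0857e18                       # one parsec in centimetres
ARCSEC_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians

def line_luminosity(flux_cgs, distance_mpc):
    """Isotropic luminosity (erg/s) from a flux in erg/s/cm^2."""
    d_cm = distance_mpc * 1.0e6 * PC_CM
    return 4.0 * math.pi * d_cm ** 2 * flux_cgs

def projected_size_pc(angle_arcsec, distance_mpc):
    """Small-angle projected size in parsec."""
    return angle_arcsec * ARCSEC_RAD * distance_mpc * 1.0e6

# Broad H-alpha flux and distance from the text.
lum = line_luminosity(2.14e-13, 4.9)        # close to the quoted 6.14e38 erg/s
offset_pc = projected_size_pc(0.15, 4.9)    # close to the quoted 3.5 parsec
```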



8 DISCUSSION AND CONCLUSIONS 

In this paper we presented the method of PCA Tomography 
and showed that it has differences and advantages when 
compared to traditional methods for analysing datacubes. 
With traditional spectroscopic techniques it would be 
difficult to show the existence of the BLR/AGN in NGC 4736; 
even more difficult would be to determine the position of the 
BLR with the accuracy we obtained here. The main results 
of this method can be summarized as follows: 

(1) PCA Tomography identifies eigenvectors, ordered as 
principal components according to the rank of the 
corresponding eigenvalues. Tomograms are images that 
represent "slices" of the data in the eigenvector space. The 
association of tomograms with eigenvectors is important for 
the interpretation of both. One can associate spectral 
characteristics with image features or vice versa. 

(2) One of the main advantages of PCA Tomography is 
the dimensional reduction. Instead of analysing tens of 
millions of pixels, one compresses the relevant information into 
a dozen eigenvectors and tomograms that present these 
data in an organized fashion. This is also important for 
data compression and transmission. 

(3) The fact that the eigenvectors are mutually orthogonal 
is important for their handling and interpretation. 
When the datacube presents uncorrelated physical 
phenomena, the orthogonality may be useful for identifying 
them. 

(4) The reconstruction of the datacube in its original 
format, but with separated (and eventually treated) 
components associated with distinct eigenvectors, allows extracting 
spectra or images in order to isolate a given feature. 

(5) Besides, by selecting the eigenvectors or tomograms 
with certain correlations or anti-correlations, one can en- 
hance features by reconstructing datacubes in original for- 
mat with tomograms normalized to unit variance. This en- 
hances the desired feature but may also increase the noise. 

(6) Various types of noise may be eliminated or corrected 
by selecting their eigenvectors and tomograms: cosmic rays, 
hot/cold pixels etc. 

(7) Flux calibration of the reconstructed datacubes is 
possible by adding the average spectrum. However, this is 
only possible directly when one takes into account all the 
components. In other situations, calibration might be possi- 
ble but could be subtle. We illustrate this by applying the 
procedure to a specific case. 
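The core of the method summarized in items (1) to (3) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the software distributed by the authors: the cube is taken to be an array ordered as (y, x, wavelength), the covariance matrix is computed over spectral pixels after subtracting the average spectrum, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def pca_tomography(cube):
    """Return eigenvalues, eigenvectors, tomograms and the average spectrum.

    cube: array of shape (n_y, n_x, n_lambda).
    """
    n_y, n_x, n_lam = cube.shape
    data = cube.reshape(n_y * n_x, n_lam)          # one row per spatial pixel
    mean_spectrum = data.mean(axis=0)
    centred = data - mean_spectrum
    cov = centred.T @ centred / (data.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)         # returned in ascending order
    order = np.argsort(eigvals)[::-1]              # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Projecting the data onto each eigenvector gives one image per component.
    tomograms = (centred @ eigvecs).reshape(n_y, n_x, n_lam)
    return eigvals, eigvecs, tomograms, mean_spectrum

rng = np.random.default_rng(0)
cube = rng.normal(size=(6, 5, 8))
eigvals, eigvecs, tomograms, mean_spectrum = pca_tomography(cube)

# Using all components and adding the average spectrum restores the cube.
restored = np.einsum('ijk,lk->ijl', tomograms, eigvecs) + mean_spectrum
```

Keeping only the first few columns of `eigvecs` and the matching tomograms gives the dimensional reduction of item (2); the orthogonality of item (3) is the statement that `eigvecs.T @ eigvecs` is the identity matrix.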

In order to illustrate the PCA methodology we applied 
it to the central region of the LINER galaxy NGC 4736. The 
dimensional reduction of the data allowed the identification 
of characteristics that were unknown in advance. For 
example, we identified a type 1 nucleus, of very low luminosity, 
displaced from the centre of the stellar bulge. By handling 
the eigenvectors and tomograms we were able to display the 
spectra (Figs. 4 and 5) and locate the BLR of this AGN 
with respect to the galactic stellar bulge (Fig. 6). 



Those interested in software for PCA Tomography 
may obtain it on the PCA Tomography Homepage, at 
http://www.astro.iag.usp.br/~pcatomography 
APPENDIX A: PRINCIPAL COMPONENTS OF 
NGC 4736 

Below we show the eigenvectors and tomograms of the 8 
principal components of the nuclear region of LINER galaxy 
NGC 4736. 
Figure A1. Tomogram of the principal component 1 and respective eigenspectrum. 

Figure A2. Tomogram of the principal component 2 and respective eigenspectrum. 

Figure A3. Tomogram of the principal component 3 and respective eigenspectrum. 

Figure A6. Tomogram of the principal component 6 and respective eigenspectrum. 

Figure A7. Tomogram of the principal component 7 and respective eigenspectrum. 




Figure A8. Tomogram of the principal component 8 and respective eigenspectrum. 



